NSF PAR Search | NSF Public Access Repository


Empirical Privacy Variance

Hu, Yuzheng; Wu, Fan; Xian, Ruicheng; Liu, Yuhang; Zakynthinou, Lydia; Kamath, Pritish; Zhang, Chiyuan; Forsyth, David (July 2025, openreview.net)

Free, publicly-accessible full text available July 19, 2026
How Do I Do That? Synthesizing 3D Hand Motion and Contacts for Everyday Interactions

https://doi.org/10.1109/CVPR52734.2025.00659

Prakash, Aditya; Lundell, Benjamin; Andreychuk, Dmitry; Forsyth, David; Gupta, Saurabh; Sawhney, Harpreet (June 2025, IEEE)

We tackle the novel problem of predicting 3D hand motion and contact maps (or Interaction Trajectories) given a single RGB view, action text, and a 3D contact point on the object as input. Our approach consists of (1) Interaction Codebook: a VQVAE model to learn a latent codebook of hand poses and contact points, effectively tokenizing interaction trajectories, (2) Interaction Predictor: a transformer-decoder module to predict the interaction trajectory from test time inputs by using an indexer module to retrieve a latent affordance from the learned codebook. To train our model, we develop a data engine that extracts 3D hand poses and contact trajectories from the diverse HoloAssist dataset. We evaluate our model on a benchmark that is 2.5-10X larger than existing works, in terms of diversity of objects and interactions observed, and test for generalization of the model across object categories, action categories, tasks, and scenes. Experimental results show the effectiveness of our approach over transformer & diffusion baselines across all settings.
Free, publicly-accessible full text available June 10, 2026
Improving Equivariance in State-of-the-Art Supervised Depth and Normal Predictors

Zhong, Yuanyi; Bhattad, Anand; Wang, YuXiong; Forsyth, David (October 2024, IEEE International Conference on Computer Vision Workshops)

Full Text Available
StyleGAN knows Normal, Depth, Albedo, and More

Bhattad, Anand; McKee, Daniel; Hoiem, Derek; Forsyth, David (December 2023, Advances in neural information processing systems)

Intrinsic images, in the original sense, are image-like maps of scene properties like depth, normal, albedo, or shading. This paper demonstrates that StyleGAN can easily be induced to produce intrinsic images. The procedure is straightforward. We show that if StyleGAN produces $G(w)$ from latent $w$, then for each type of intrinsic image, there is a fixed offset $d_c$ so that $G(w + d_c)$ is that type of intrinsic image for $G(w)$. Here $d_c$ is independent of $w$. The StyleGAN we used was pretrained by others, so this property is not some accident of our training regime. We show that there are image transformations StyleGAN will not produce in this fashion, so StyleGAN is not a generic image regression engine. It is conceptually exciting that an image generator should "know" and represent intrinsic images. There may also be practical advantages to using a generative model to produce intrinsic images. The intrinsic images obtained from StyleGAN compare well both qualitatively and quantitatively with those obtained by using SOTA image regression techniques; but StyleGAN's intrinsic images are robust to relighting effects, unlike SOTA methods.
Full Text Available
Sim-on-Wheels: Physical World in the Loop Simulation for Self-Driving

https://doi.org/10.1109/LRA.2023.3325689

Shen, Yuan; Chandaka, Bhargav; Lin, Zhi-Hao; Zhai, Albert; Cui, Hang; Forsyth, David; Wang, Shenlong (December 2023, IEEE Robotics and Automation Letters)

Full Text Available
